Dataset statistics
| Number of variables | 12 |
|---|---|
| Number of observations | 7111 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 666.8 KiB |
| Average record size in memory | 96.0 B |
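The memory figures above can be reproduced with simple arithmetic. The sketch below assumes (the report does not say so) that every value is stored in 8 bytes, as with `float64` or `datetime64[ns]` columns, and that the index contributes a fixed 128 bytes; `BYTES_PER_VALUE` and `INDEX_BYTES` are illustrative names for those assumptions.

```python
# Figures copied from the "Dataset statistics" table above.
N_ROWS = 7111
N_COLS = 12

# Assumptions, not stated in the report: 8 bytes per stored value
# and a fixed 128-byte index.
BYTES_PER_VALUE = 8
INDEX_BYTES = 128

record_bytes = N_COLS * BYTES_PER_VALUE
total_kib = (N_ROWS * record_bytes + INDEX_BYTES) / 1024
column_kib = (N_ROWS * BYTES_PER_VALUE + INDEX_BYTES) / 1024

print(f"record size: {record_bytes} B")
print(f"total size:  {total_kib:.1f} KiB")
print(f"per column:  {column_kib:.1f} KiB")
```

Under these assumptions the results agree with the reported 96.0 B per record, 666.8 KiB in total, and 55.7 KiB per variable.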
Variable types
| DateTime | 1 |
|---|---|
| Numeric | 11 |
deg_C is highly correlated with relative_humidity | High correlation |
relative_humidity is highly correlated with deg_C | High correlation |
absolute_humidity is highly correlated with sensor_4 | High correlation |
sensor_1 is highly correlated with sensor_2 and 6 other fields | High correlation |
sensor_2 is highly correlated with sensor_1 and 6 other fields | High correlation |
sensor_3 is highly correlated with sensor_1 and 6 other fields | High correlation |
sensor_4 is highly correlated with absolute_humidity and 6 other fields | High correlation |
sensor_5 is highly correlated with sensor_1 and 6 other fields | High correlation |
target_carbon_monoxide is highly correlated with sensor_1 and 6 other fields | High correlation |
target_benzene is highly correlated with sensor_1 and 6 other fields | High correlation |
target_nitrogen_oxides is highly correlated with sensor_1 and 5 other fields | High correlation |
deg_C is highly correlated with relative_humidity | High correlation |
relative_humidity is highly correlated with deg_C | High correlation |
absolute_humidity is highly correlated with sensor_4 | High correlation |
sensor_1 is highly correlated with sensor_2 and 6 other fields | High correlation |
sensor_2 is highly correlated with sensor_1 and 6 other fields | High correlation |
sensor_3 is highly correlated with sensor_1 and 5 other fields | High correlation |
sensor_4 is highly correlated with absolute_humidity and 6 other fields | High correlation |
sensor_5 is highly correlated with sensor_1 and 6 other fields | High correlation |
target_carbon_monoxide is highly correlated with sensor_1 and 6 other fields | High correlation |
target_benzene is highly correlated with sensor_1 and 6 other fields | High correlation |
target_nitrogen_oxides is highly correlated with sensor_1 and 4 other fields | High correlation |
sensor_1 is highly correlated with sensor_2 and 5 other fields | High correlation |
sensor_2 is highly correlated with sensor_1 and 6 other fields | High correlation |
sensor_3 is highly correlated with sensor_1 and 6 other fields | High correlation |
sensor_4 is highly correlated with sensor_2 and 2 other fields | High correlation |
sensor_5 is highly correlated with sensor_1 and 5 other fields | High correlation |
target_carbon_monoxide is highly correlated with sensor_1 and 5 other fields | High correlation |
target_benzene is highly correlated with sensor_1 and 6 other fields | High correlation |
target_nitrogen_oxides is highly correlated with sensor_1 and 5 other fields | High correlation |
deg_C is highly correlated with relative_humidity and 2 other fields | High correlation |
relative_humidity is highly correlated with deg_C | High correlation |
absolute_humidity is highly correlated with deg_C and 3 other fields | High correlation |
sensor_1 is highly correlated with sensor_2 and 6 other fields | High correlation |
sensor_2 is highly correlated with absolute_humidity and 7 other fields | High correlation |
sensor_3 is highly correlated with absolute_humidity and 7 other fields | High correlation |
sensor_4 is highly correlated with deg_C and 8 other fields | High correlation |
sensor_5 is highly correlated with sensor_1 and 6 other fields | High correlation |
target_carbon_monoxide is highly correlated with sensor_1 and 6 other fields | High correlation |
target_benzene is highly correlated with sensor_1 and 6 other fields | High correlation |
target_nitrogen_oxides is highly correlated with sensor_1 and 6 other fields | High correlation |
date_time has unique values | Unique |
Reproduction
| Analysis started | 2021-10-05 10:15:37.329409 |
|---|---|
| Analysis finished | 2021-10-05 10:15:53.865165 |
| Duration | 16.54 seconds |
| Software version | pandas-profiling v3.1.0 |
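As a quick consistency check, the reported duration can be recomputed from the two timestamps in the table. This is a minimal sketch using only the standard library; the format string `FMT` is an assumption about how the timestamps are written.

```python
from datetime import datetime

# Assumed timestamp layout, matching the values shown in the table.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2021-10-05 10:15:37.329409", FMT)
finished = datetime.strptime("2021-10-05 10:15:53.865165", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```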
date_time
| Distinct | 7111 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 55.7 KiB |
| Minimum | 2010-03-10 18:00:00 |
|---|---|
| Maximum | 2011-01-01 00:00:00 |
Histogram with fixed size bins (bins=50)
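The histograms in this report use fixed-size bins. A minimal sketch of that binning, assuming equal-width bins that span the observed minimum and maximum, is shown below. The function name `fixed_width_histogram` is hypothetical, and the sample is the ten `deg_C` values from the "First rows" table later in the report, binned into 4 bins instead of 50 for readability.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; bin width would be zero")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum lies on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges


# deg_C values of the first ten rows of the dataset.
sample = [13.1, 13.2, 12.6, 11.0, 11.9, 11.2, 10.7, 10.3, 10.1, 10.5]
counts, edges = fixed_width_histogram(sample, bins=4)
print(counts)
```

Every observation lands in exactly one bin, so the counts always sum to the number of observations.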
deg_C
| Distinct | 408 |
|---|---|
| Distinct (%) | 5.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 20.87803403 |
| Minimum | 1.3 |
|---|---|
| Maximum | 46.1 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 55.7 KiB |
Quantile statistics
| Minimum | 1.3 |
|---|---|
| 5-th percentile | 8.8 |
| Q1 | 14.9 |
| median | 20.7 |
| Q3 | 25.8 |
| 95-th percentile | 35.6 |
| Maximum | 46.1 |
| Range | 44.8 |
| Interquartile range (IQR) | 10.9 |
Descriptive statistics
| Standard deviation | 7.937916707 |
|---|---|
| Coefficient of variation (CV) | 0.380204223 |
| Kurtosis | -0.3130863355 |
| Mean | 20.87803403 |
| Median Absolute Deviation (MAD) | 5.5 |
| Skewness | 0.2903180452 |
| Sum | 148463.7 |
| Variance | 63.01052165 |
| Monotonicity | Not monotonic |
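Several of the statistics above are derived from others, so they can be cross-checked. The sketch below recomputes the range, the interquartile range, the coefficient of variation, and the variance from the values reported for this variable; it uses only numbers copied from the tables above and makes no assumption about how the report itself computes them.

```python
import math

# Values copied from the quantile and descriptive statistics tables above.
minimum, maximum = 1.3, 46.1
q1, q3 = 14.9, 25.8
mean = 20.87803403
std = 7.937916707

value_range = maximum - minimum   # reported: 44.8
iqr = q3 - q1                     # reported: 10.9
cv = std / mean                   # reported: 0.380204223
variance = std ** 2               # reported: 63.01052165

print(f"range={value_range:.1f} iqr={iqr:.1f} cv={cv:.6f} variance={variance:.4f}")
assert math.isclose(cv, 0.380204223, rel_tol=1e-6)
assert math.isclose(variance, 63.01052165, rel_tol=1e-6)
```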
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 21 | 56 | 0.8% |
| 20.4 | 50 | 0.7% |
| 25.4 | 50 | 0.7% |
| 22.5 | 49 | 0.7% |
| 25 | 48 | 0.7% |
| 21.3 | 47 | 0.7% |
| 24.5 | 46 | 0.6% |
| 24.3 | 45 | 0.6% |
| 22.8 | 42 | 0.6% |
| 23.8 | 42 | 0.6% |
| Other values (398) | 6636 |
| Value | Count | Frequency (%) |
| 1.3 | 2 | |
| 1.4 | 1 | |
| 1.5 | 1 | |
| 1.7 | 1 | |
| 2.2 | 1 | |
| 2.3 | 2 | |
| 2.5 | 2 | |
| 2.6 | 1 | |
| 2.9 | 1 | |
| 3 | 1 |
| Value | Count | Frequency (%) |
| 46.1 | 1 | < 0.1% |
| 45.3 | 1 | < 0.1% |
| 45.2 | 1 | < 0.1% |
| 44.1 | 1 | < 0.1% |
| 43.2 | 1 | < 0.1% |
| 43.1 | 1 | < 0.1% |
| 43 | 3 | |
| 42.9 | 1 | < 0.1% |
| 42.8 | 1 | < 0.1% |
| 42.7 | 1 | < 0.1% |
relative_humidity
| Distinct | 762 |
|---|---|
| Distinct (%) | 10.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 47.56100408 |
| Minimum | 8.9 |
|---|---|
| Maximum | 90.8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 55.7 KiB |
Quantile statistics
| Minimum | 8.9 |
|---|---|
| 5-th percentile | 19.8 |
| Q1 | 33.7 |
| median | 47.3 |
| Q3 | 60.8 |
| 95-th percentile | 77.05 |
| Maximum | 90.8 |
| Range | 81.9 |
| Interquartile range (IQR) | 27.1 |
Descriptive statistics
| Standard deviation | 17.39873072 |
|---|---|
| Coefficient of variation (CV) | 0.3658192475 |
| Kurtosis | -0.8188864778 |
| Mean | 47.56100408 |
| Median Absolute Deviation (MAD) | 13.6 |
| Skewness | 0.07862123326 |
| Sum | 338206.3 |
| Variance | 302.7158307 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 54.5 | 25 | 0.4% |
| 50.6 | 24 | 0.3% |
| 39 | 23 | 0.3% |
| 34.6 | 23 | 0.3% |
| 47.3 | 23 | 0.3% |
| 43.6 | 23 | 0.3% |
| 46.9 | 21 | 0.3% |
| 51.3 | 20 | 0.3% |
| 46.4 | 20 | 0.3% |
| 31.1 | 20 | 0.3% |
| Other values (752) | 6889 |
| Value | Count | Frequency (%) |
| 8.9 | 1 | |
| 9 | 1 | |
| 9.2 | 1 | |
| 9.3 | 1 | |
| 9.5 | 1 | |
| 9.8 | 1 | |
| 10.3 | 1 | |
| 10.4 | 2 | |
| 10.5 | 1 | |
| 11.1 | 1 |
| Value | Count | Frequency (%) |
| 90.8 | 2 | |
| 89.6 | 1 | |
| 89.1 | 2 | |
| 88.7 | 1 | |
| 88.6 | 1 | |
| 88 | 2 | |
| 87.8 | 1 | |
| 87.5 | 2 | |
| 87.3 | 2 | |
| 86.9 | 1 |
absolute_humidity
| Distinct | 5451 |
|---|---|
| Distinct (%) | 76.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.110308761 |
| Minimum | 0.1988 |
|---|---|
| Maximum | 2.231 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 55.7 KiB |
Quantile statistics
| Minimum | 0.1988 |
|---|---|
| 5-th percentile | 0.42105 |
| Q1 | 0.8559 |
| median | 1.0835 |
| Q3 | 1.40415 |
| 95-th percentile | 1.75615 |
| Maximum | 2.231 |
| Range | 2.0322 |
| Interquartile range (IQR) | 0.54825 |
Descriptive statistics
| Standard deviation | 0.3989500846 |
|---|---|
| Coefficient of variation (CV) | 0.3593145426 |
| Kurtosis | -0.3195498508 |
| Mean | 1.110308761 |
| Median Absolute Deviation (MAD) | 0.267 |
| Skewness | -0.03589043194 |
| Sum | 7895.4056 |
| Variance | 0.15916117 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0.2258 | 9 | 0.1% |
| 0.2257 | 9 | 0.1% |
| 0.2256 | 9 | 0.1% |
| 0.2263 | 6 | 0.1% |
| 1.1199 | 6 | 0.1% |
| 0.2262 | 6 | 0.1% |
| 0.228 | 5 | 0.1% |
| 0.2261 | 5 | 0.1% |
| 0.2259 | 5 | 0.1% |
| 0.2254 | 5 | 0.1% |
| Other values (5441) | 7046 |
| Value | Count | Frequency (%) |
| 0.1988 | 1 | |
| 0.2029 | 1 | |
| 0.2136 | 1 | |
| 0.2146 | 1 | |
| 0.2148 | 1 | |
| 0.2163 | 1 | |
| 0.2167 | 1 | |
| 0.2169 | 1 | |
| 0.217 | 1 | |
| 0.218 | 1 |
| Value | Count | Frequency (%) |
| 2.231 | 1 | |
| 2.1806 | 1 | |
| 2.1766 | 1 | |
| 2.1719 | 1 | |
| 2.1395 | 1 | |
| 2.1362 | 1 | |
| 2.1247 | 1 | |
| 2.1195 | 1 | |
| 2.117 | 1 | |
| 2.1164 | 1 |
sensor_1
| Distinct | 3882 |
|---|---|
| Distinct (%) | 54.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1091.5721 |
| Minimum | 620.3 |
|---|---|
| Maximum | 2088.3 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 55.7 KiB |
Quantile statistics
| Minimum | 620.3 |
|---|---|
| 5-th percentile | 796.85 |
| Q1 | 930.25 |
| median | 1060.5 |
| Q3 | 1215.8 |
| 95-th percentile | 1502.55 |
| Maximum | 2088.3 |
| Range | 1468 |
| Interquartile range (IQR) | 285.55 |
Descriptive statistics
| Standard deviation | 218.5375542 |
|---|---|
| Coefficient of variation (CV) | 0.2002044155 |
| Kurtosis | 0.6151993578 |
| Mean | 1091.5721 |
| Median Absolute Deviation (MAD) | 137.8 |
| Skewness | 0.7959241429 |
| Sum | 7762169.2 |
| Variance | 47758.66258 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1007 | 12 | 0.2% |
| 1054.7 | 10 | 0.1% |
| 930 | 10 | 0.1% |
| 982.6 | 9 | 0.1% |
| 1050 | 9 | 0.1% |
| 951.4 | 8 | 0.1% |
| 980.7 | 8 | 0.1% |
| 927.2 | 8 | 0.1% |
| 970 | 8 | 0.1% |
| 1150.4 | 8 | 0.1% |
| Other values (3872) | 7021 |
| Value | Count | Frequency (%) |
| 620.3 | 1 | |
| 633.3 | 1 | |
| 634.1 | 1 | |
| 635.2 | 1 | |
| 635.4 | 1 | |
| 640.3 | 1 | |
| 655.6 | 1 | |
| 660.3 | 1 | |
| 664.3 | 2 | |
| 665.5 | 1 |
| Value | Count | Frequency (%) |
| 2088.3 | 1 | |
| 2022.5 | 1 | |
| 2021.6 | 1 | |
| 1999.2 | 1 | |
| 1963.1 | 1 | |
| 1957.3 | 1 | |
| 1956 | 1 | |
| 1953.6 | 1 | |
| 1953.3 | 1 | |
| 1950 | 1 |
sensor_2
| Distinct | 4254 |
|---|---|
| Distinct (%) | 59.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 938.0649698 |
| Minimum | 364 |
|---|---|
| Maximum | 2302.6 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 55.7 KiB |
Quantile statistics
| Minimum | 364 |
|---|---|
| 5-th percentile | 512.6 |
| Q1 | 734.9 |
| median | 914.2 |
| Q3 | 1124.1 |
| 95-th percentile | 1434.5 |
| Maximum | 2302.6 |
| Range | 1938.6 |
| Interquartile range (IQR) | 389.2 |
Descriptive statistics
| Standard deviation | 281.9789878 |
|---|---|
| Coefficient of variation (CV) | 0.3005964373 |
| Kurtosis | 0.1166562533 |
| Mean | 938.0649698 |
| Median Absolute Deviation (MAD) | 192.5 |
| Skewness | 0.4178397604 |
| Sum | 6670580 |
| Variance | 79512.14959 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 373.7 | 12 | 0.2% |
| 401.3 | 11 | 0.2% |
| 365.9 | 10 | 0.1% |
| 377.7 | 9 | 0.1% |
| 417 | 9 | 0.1% |
| 393.4 | 8 | 0.1% |
| 946.6 | 8 | 0.1% |
| 1081.6 | 7 | 0.1% |
| 1074.2 | 7 | 0.1% |
| 413.1 | 7 | 0.1% |
| Other values (4244) | 7023 |
| Value | Count | Frequency (%) |
| 364 | 1 | < 0.1% |
| 364.8 | 2 | < 0.1% |
| 364.9 | 1 | < 0.1% |
| 365.1 | 1 | < 0.1% |
| 365.3 | 1 | < 0.1% |
| 365.8 | 3 | < 0.1% |
| 365.9 | 10 | |
| 366.1 | 1 | < 0.1% |
| 368.8 | 1 | < 0.1% |
| 369 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 2302.6 | 1 | |
| 2079 | 1 | |
| 2057 | 1 | |
| 1966.9 | 1 | |
| 1941.4 | 1 | |
| 1938.7 | 1 | |
| 1923.5 | 1 | |
| 1913.3 | 1 | |
| 1907.9 | 1 | |
| 1903.3 | 1 |
sensor_3
| Distinct | 4251 |
|---|---|
| Distinct (%) | 59.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 883.9033047 |
| Minimum | 310.6 |
|---|---|
| Maximum | 2567.4 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 55.7 KiB |
Quantile statistics
| Minimum | 310.6 |
|---|---|
| 5-th percentile | 496.75 |
| Q1 | 681.05 |
| median | 827.8 |
| Q3 | 1008.85 |
| 95-th percentile | 1537.1 |
| Maximum | 2567.4 |
| Range | 2256.8 |
| Interquartile range (IQR) | 327.8 |
Descriptive statistics
| Standard deviation | 310.4563551 |
|---|---|
| Coefficient of variation (CV) | 0.3512333911 |
| Kurtosis | 2.61958281 |
| Mean | 883.9033047 |
| Median Absolute Deviation (MAD) | 159.9 |
| Skewness | 1.407229292 |
| Sum | 6285436.4 |
| Variance | 96383.14843 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 832.3 | 9 | 0.1% |
| 776.2 | 9 | 0.1% |
| 831 | 8 | 0.1% |
| 740 | 8 | 0.1% |
| 890 | 8 | 0.1% |
| 814 | 7 | 0.1% |
| 816 | 7 | 0.1% |
| 651 | 7 | 0.1% |
| 1045.4 | 7 | 0.1% |
| 765.4 | 7 | 0.1% |
| Other values (4241) | 7034 |
| Value | Count | Frequency (%) |
| 310.6 | 1 | |
| 317.1 | 1 | |
| 323.4 | 1 | |
| 327.1 | 1 | |
| 328.2 | 1 | |
| 328.4 | 1 | |
| 331.3 | 1 | |
| 331.5 | 1 | |
| 335 | 1 | |
| 340 | 1 |
| Value | Count | Frequency (%) |
| 2567.4 | 1 | |
| 2548.8 | 1 | |
| 2533.4 | 1 | |
| 2466.6 | 1 | |
| 2400.9 | 1 | |
| 2225.3 | 1 | |
| 2225.2 | 1 | |
| 2169.8 | 1 | |
| 2164.2 | 1 | |
| 2121 | 1 |
sensor_4
| Distinct | 4655 |
|---|---|
| Distinct (%) | 65.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1513.238349 |
| Minimum | 552.9 |
|---|---|
| Maximum | 2913.8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 55.7 KiB |
Quantile statistics
| Minimum | 552.9 |
|---|---|
| 5-th percentile | 907.35 |
| Q1 | 1320.35 |
| median | 1513.1 |
| Q3 | 1720.4 |
| 95-th percentile | 2083.15 |
| Maximum | 2913.8 |
| Range | 2360.9 |
| Interquartile range (IQR) | 400.05 |
Descriptive statistics
| Standard deviation | 350.18031 |
|---|---|
| Coefficient of variation (CV) | 0.2314112051 |
| Kurtosis | 0.8087355384 |
| Mean | 1513.238349 |
| Median Absolute Deviation (MAD) | 199.6 |
| Skewness | -0.1266305599 |
| Sum | 10760637.9 |
| Variance | 122626.2495 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1458.2 | 9 | 0.1% |
| 1488 | 9 | 0.1% |
| 1455.4 | 8 | 0.1% |
| 1616 | 7 | 0.1% |
| 1612.1 | 7 | 0.1% |
| 1511 | 7 | 0.1% |
| 1426.9 | 7 | 0.1% |
| 1681 | 6 | 0.1% |
| 1277.8 | 6 | 0.1% |
| 1364.2 | 6 | 0.1% |
| Other values (4645) | 7039 |
| Value | Count | Frequency (%) |
| 552.9 | 1 | |
| 553.1 | 1 | |
| 554.2 | 1 | |
| 554.3 | 1 | |
| 557 | 1 | |
| 559.9 | 1 | |
| 561 | 1 | |
| 561.2 | 1 | |
| 562.1 | 1 | |
| 562.6 | 1 |
| Value | Count | Frequency (%) |
| 2913.8 | 1 | |
| 2779.3 | 1 | |
| 2773.5 | 1 | |
| 2746.6 | 1 | |
| 2722.3 | 1 | |
| 2712.5 | 1 | |
| 2690.1 | 1 | |
| 2688.2 | 1 | |
| 2688 | 1 | |
| 2664.1 | 1 |
sensor_5
| Distinct | 4839 |
|---|---|
| Distinct (%) | 68.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 998.3355646 |
| Minimum | 242.7 |
|---|---|
| Maximum | 2594.6 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 55.7 KiB |
Quantile statistics
| Minimum | 242.7 |
|---|---|
| 5-th percentile | 476.8 |
| Q1 | 722.85 |
| median | 928.7 |
| Q3 | 1224.7 |
| 95-th percentile | 1715.5 |
| Maximum | 2594.6 |
| Range | 2351.9 |
| Interquartile range (IQR) | 501.85 |
Descriptive statistics
| Standard deviation | 381.5376954 |
|---|---|
| Coefficient of variation (CV) | 0.382173799 |
| Kurtosis | 0.4004843033 |
| Mean | 998.3355646 |
| Median Absolute Deviation (MAD) | 241.1 |
| Skewness | 0.7682900442 |
| Sum | 7099164.2 |
| Variance | 145571.013 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 911 | 7 | 0.1% |
| 894 | 7 | 0.1% |
| 658.4 | 6 | 0.1% |
| 884 | 6 | 0.1% |
| 839 | 6 | 0.1% |
| 799 | 6 | 0.1% |
| 944.5 | 6 | 0.1% |
| 724.2 | 6 | 0.1% |
| 812.2 | 6 | 0.1% |
| 930.2 | 6 | 0.1% |
| Other values (4829) | 7049 |
| Value | Count | Frequency (%) |
| 242.7 | 1 | |
| 257.7 | 1 | |
| 265.3 | 1 | |
| 268.5 | 1 | |
| 271.3 | 1 | |
| 283.9 | 1 | |
| 285.5 | 1 | |
| 293.9 | 1 | |
| 294.6 | 1 | |
| 294.9 | 1 |
| Value | Count | Frequency (%) |
| 2594.6 | 1 | |
| 2523 | 1 | |
| 2514.3 | 1 | |
| 2507 | 1 | |
| 2496.8 | 1 | |
| 2465 | 1 | |
| 2450.2 | 1 | |
| 2421.3 | 1 | |
| 2405.6 | 1 | |
| 2370.2 | 1 |
target_carbon_monoxide
Real number (ℝ≥0)
High correlation
| Distinct | 95 |
|---|---|
| Distinct (%) | 1.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.086218535 |
| Minimum | 0.1 |
|---|---|
| Maximum | 12.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 55.7 KiB |
Quantile statistics
| Minimum | 0.1 |
|---|---|
| 5-th percentile | 0.5 |
| Q1 | 1 |
| median | 1.7 |
| Q3 | 2.8 |
| 95-th percentile | 4.9 |
| Maximum | 12.5 |
| Range | 12.4 |
| Interquartile range (IQR) | 1.8 |
Descriptive statistics
| Standard deviation | 1.44710922 |
|---|---|
| Coefficient of variation (CV) | 0.6936517894 |
| Kurtosis | 3.096547119 |
| Mean | 2.086218535 |
| Median Absolute Deviation (MAD) | 0.8 |
| Skewness | 1.469212986 |
| Sum | 14835.1 |
| Variance | 2.094125094 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1 | 286 | 4.0% |
| 0.7 | 280 | 3.9% |
| 1.6 | 276 | 3.9% |
| 0.8 | 261 | 3.7% |
| 0.6 | 260 | 3.7% |
| 1.2 | 256 | 3.6% |
| 1.5 | 254 | 3.6% |
| 1.4 | 243 | 3.4% |
| 1.8 | 243 | 3.4% |
| 0.5 | 237 | 3.3% |
| Other values (85) | 4515 |
| Value | Count | Frequency (%) |
| 0.1 | 17 | 0.2% |
| 0.2 | 28 | 0.4% |
| 0.3 | 93 | 1.3% |
| 0.4 | 189 | |
| 0.5 | 237 | |
| 0.6 | 260 | |
| 0.7 | 280 | |
| 0.8 | 261 | |
| 0.9 | 232 | |
| 1 | 286 |
| Value | Count | Frequency (%) |
| 12.5 | 1 | |
| 12 | 1 | |
| 10 | 1 | |
| 9.9 | 2 | |
| 9.8 | 1 | |
| 9.6 | 1 | |
| 9.5 | 1 | |
| 9.1 | 1 | |
| 8.9 | 2 | |
| 8.8 | 2 |
target_benzene
Real number (ℝ≥0)
High correlation
| Distinct | 405 |
|---|---|
| Distinct (%) | 5.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10.23708339 |
| Minimum | 0.1 |
|---|---|
| Maximum | 63.7 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 55.7 KiB |
Quantile statistics
| Minimum | 0.1 |
|---|---|
| 5-th percentile | 1.2 |
| Q1 | 4.5 |
| median | 8.5 |
| Q3 | 14.2 |
| 95-th percentile | 25.05 |
| Maximum | 63.7 |
| Range | 63.6 |
| Interquartile range (IQR) | 9.7 |
Descriptive statistics
| Standard deviation | 7.694425724 |
|---|---|
| Coefficient of variation (CV) | 0.7516228431 |
| Kurtosis | 2.428752107 |
| Mean | 10.23708339 |
| Median Absolute Deviation (MAD) | 4.6 |
| Skewness | 1.324939317 |
| Sum | 72795.9 |
| Variance | 59.20418722 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0.1 | 228 | 3.2% |
| 5.6 | 68 | 1.0% |
| 2.8 | 64 | 0.9% |
| 3.8 | 62 | 0.9% |
| 6.7 | 58 | 0.8% |
| 5.2 | 56 | 0.8% |
| 3.1 | 55 | 0.8% |
| 3.9 | 55 | 0.8% |
| 7 | 53 | 0.7% |
| 8.2 | 53 | 0.7% |
| Other values (395) | 6359 |
| Value | Count | Frequency (%) |
| 0.1 | 228 | |
| 0.2 | 2 | < 0.1% |
| 0.3 | 2 | < 0.1% |
| 0.4 | 8 | 0.1% |
| 0.5 | 11 | 0.2% |
| 0.6 | 10 | 0.1% |
| 0.7 | 23 | 0.3% |
| 0.8 | 16 | 0.2% |
| 0.9 | 13 | 0.2% |
| 1 | 16 | 0.2% |
| Value | Count | Frequency (%) |
| 63.7 | 1 | |
| 50.5 | 2 | |
| 49.9 | 2 | |
| 48.7 | 1 | |
| 48.2 | 1 | |
| 48.1 | 1 | |
| 47.8 | 1 | |
| 47.7 | 1 | |
| 47.5 | 1 | |
| 45.4 | 1 |
target_nitrogen_oxides
Real number (ℝ≥0)
High correlation
| Distinct | 3268 |
|---|---|
| Distinct (%) | 46.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 204.0667839 |
| Minimum | 1.9 |
|---|---|
| Maximum | 1472.3 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 55.7 KiB |
Quantile statistics
| Minimum | 1.9 |
|---|---|
| 5-th percentile | 31.5 |
| Q1 | 76.45 |
| median | 141 |
| Q3 | 260 |
| 95-th percentile | 605.95 |
| Maximum | 1472.3 |
| Range | 1470.4 |
| Interquartile range (IQR) | 183.55 |
Descriptive statistics
| Standard deviation | 193.9277234 |
|---|---|
| Coefficient of variation (CV) | 0.9503149886 |
| Kurtosis | 6.005939643 |
| Mean | 204.0667839 |
| Median Absolute Deviation (MAD) | 78.1 |
| Skewness | 2.177252062 |
| Sum | 1451118.9 |
| Variance | 37607.9619 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 97 | 14 | 0.2% |
| 109.2 | 14 | 0.2% |
| 44.6 | 14 | 0.2% |
| 100 | 13 | 0.2% |
| 92.1 | 12 | 0.2% |
| 106.1 | 12 | 0.2% |
| 126 | 11 | 0.2% |
| 54.1 | 11 | 0.2% |
| 59.2 | 11 | 0.2% |
| 68.6 | 11 | 0.2% |
| Other values (3258) | 6988 |
| Value | Count | Frequency (%) |
| 1.9 | 1 | |
| 3.9 | 1 | |
| 6.2 | 1 | |
| 6.7 | 1 | |
| 8.3 | 1 | |
| 8.6 | 1 | |
| 9 | 1 | |
| 9.6 | 1 | |
| 10 | 1 | |
| 10.3 | 1 |
| Value | Count | Frequency (%) |
| 1472.3 | 2 | |
| 1405 | 1 | |
| 1358 | 1 | |
| 1354.5 | 1 | |
| 1341.6 | 1 | |
| 1336.2 | 1 | |
| 1327 | 1 | |
| 1304.6 | 1 | |
| 1296.9 | 1 | |
| 1261 | 1 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
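The calculation described above can be sketched in plain Python: rank both variables, then divide the covariance of the ranks by the product of their standard deviations. The helper names `average_ranks`, `pearson`, and `spearman` are illustrative, ties are given their average rank (an assumption about tie handling), and the sample is the `deg_C` and `relative_humidity` values of the first ten rows of the dataset.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Pearson correlation applied to the rank variables of x and y."""
    return pearson(average_ranks(x), average_ranks(y))


deg_c = [13.1, 13.2, 12.6, 11.0, 11.9, 11.2, 10.7, 10.3, 10.1, 10.5]
rel_hum = [46.0, 45.3, 56.2, 62.4, 59.0, 56.8, 55.7, 57.0, 62.7, 59.6]
rho = spearman(deg_c, rel_hum)
print(f"rho = {rho:.4f}")
```

Because only ranks enter the calculation, a nonlinear but strictly increasing relationship such as y = x³ still yields ρ = 1.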
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
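A minimal sketch of this calculation, together with a check of the invariance property, is given below. The function name `pearson` is illustrative, the sample is the `deg_C` and `relative_humidity` values of the first ten rows of the dataset, and the shift and scale constants are arbitrary.

```python
import math


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


deg_c = [13.1, 13.2, 12.6, 11.0, 11.9, 11.2, 10.7, 10.3, 10.1, 10.5]
rel_hum = [46.0, 45.3, 56.2, 62.4, 59.0, 56.8, 55.7, 57.0, 62.7, 59.6]

r = pearson(deg_c, rel_hum)

# Shifting and rescaling each variable (with positive scale) leaves r unchanged.
shifted_x = [2.0 * v + 3.0 for v in deg_c]
shifted_y = [0.5 * v - 1.0 for v in rel_hum]
r_shifted = pearson(shifted_x, shifted_y)

print(f"r = {r:.4f}, after shift and scale = {r_shifted:.4f}")
```

Note that a negative scale factor on one variable would flip the sign of r while keeping its magnitude.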
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
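The pair-counting definition above translates directly into code. The sketch below implements that formula as written, which corresponds to the variant often called τ-a; library implementations may apply a tie correction (τ-b), which gives the same result when there are no ties. The function name `kendall_tau_a` is illustrative, and the sample is the `deg_C` and `relative_humidity` values of the first ten rows of the dataset.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1   # both variables move in the same direction
        elif s < 0:
            discordant += 1   # the variables move in opposite directions
        # s == 0 means a tie in x or y; the pair counts toward neither group
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


deg_c = [13.1, 13.2, 12.6, 11.0, 11.9, 11.2, 10.7, 10.3, 10.1, 10.5]
rel_hum = [46.0, 45.3, 56.2, 62.4, 59.0, 56.8, 55.7, 57.0, 62.7, 59.6]
tau = kendall_tau_a(deg_c, rel_hum)
print(f"tau = {tau:.4f}")
```

This pairwise loop is quadratic in the number of observations, which is fine for illustration but slow for the full 7111-row dataset.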
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for it.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
First rows
| | date_time | deg_C | relative_humidity | absolute_humidity | sensor_1 | sensor_2 | sensor_3 | sensor_4 | sensor_5 | target_carbon_monoxide | target_benzene | target_nitrogen_oxides |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2010-03-10 18:00:00 | 13.1 | 46.0 | 0.7578 | 1387.2 | 1087.8 | 1056.0 | 1742.8 | 1293.4 | 2.5 | 12.0 | 167.7 |
| 1 | 2010-03-10 19:00:00 | 13.2 | 45.3 | 0.7255 | 1279.1 | 888.2 | 1197.5 | 1449.9 | 1010.9 | 2.1 | 9.9 | 98.9 |
| 2 | 2010-03-10 20:00:00 | 12.6 | 56.2 | 0.7502 | 1331.9 | 929.6 | 1060.2 | 1586.1 | 1117.0 | 2.2 | 9.2 | 127.1 |
| 3 | 2010-03-10 21:00:00 | 11.0 | 62.4 | 0.7867 | 1321.0 | 929.0 | 1102.9 | 1536.5 | 1263.2 | 2.2 | 9.7 | 177.2 |
| 4 | 2010-03-10 22:00:00 | 11.9 | 59.0 | 0.7888 | 1272.0 | 852.7 | 1180.9 | 1415.5 | 1132.2 | 1.5 | 6.4 | 121.8 |
| 5 | 2010-03-10 23:00:00 | 11.2 | 56.8 | 0.7848 | 1220.9 | 697.5 | 1417.2 | 1462.6 | 949.0 | 1.2 | 4.4 | 88.1 |
| 6 | 2010-03-11 00:00:00 | 10.7 | 55.7 | 0.7603 | 1244.2 | 669.3 | 1491.2 | 1413.0 | 769.6 | 1.2 | 3.7 | 59.5 |
| 7 | 2010-03-11 01:00:00 | 10.3 | 57.0 | 0.7702 | 1181.4 | 631.7 | 1511.1 | 1359.7 | 715.4 | 1.0 | 3.4 | 63.9 |
| 8 | 2010-03-11 02:00:00 | 10.1 | 62.7 | 0.7648 | 1159.6 | 602.9 | 1610.6 | 1212.2 | 657.2 | 0.9 | 2.2 | 46.4 |
| 9 | 2010-03-11 03:00:00 | 10.5 | 59.6 | 0.7517 | 1030.2 | 521.7 | 1790.2 | 1148.6 | 491.0 | 0.6 | 1.6 | 43.0 |
Last rows
| | date_time | deg_C | relative_humidity | absolute_humidity | sensor_1 | sensor_2 | sensor_3 | sensor_4 | sensor_5 | target_carbon_monoxide | target_benzene | target_nitrogen_oxides |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 7101 | 2010-12-31 15:00:00 | 12.1 | 29.8 | 0.4160 | 828.4 | 760.0 | 991.0 | 882.0 | 597.9 | 0.9 | 4.3 | 165.9 |
| 7102 | 2010-12-31 16:00:00 | 12.4 | 25.6 | 0.4185 | 926.2 | 746.4 | 843.5 | 974.4 | 769.6 | 1.1 | 5.8 | 199.4 |
| 7103 | 2010-12-31 17:00:00 | 12.1 | 29.3 | 0.4148 | 1000.5 | 883.0 | 834.4 | 926.3 | 913.9 | 1.4 | 7.4 | 228.5 |
| 7104 | 2010-12-31 18:00:00 | 10.2 | 32.0 | 0.4112 | 922.7 | 800.7 | 856.5 | 876.1 | 819.8 | 1.5 | 5.9 | 206.1 |
| 7105 | 2010-12-31 19:00:00 | 9.1 | 34.3 | 0.3958 | 957.9 | 741.9 | 970.3 | 915.1 | 866.0 | 1.2 | 4.9 | 211.0 |
| 7106 | 2010-12-31 20:00:00 | 9.2 | 32.0 | 0.3871 | 1000.5 | 811.2 | 873.0 | 909.0 | 910.5 | 1.3 | 5.1 | 191.1 |
| 7107 | 2010-12-31 21:00:00 | 9.1 | 33.2 | 0.3766 | 1022.7 | 790.0 | 951.6 | 912.9 | 903.4 | 1.4 | 5.8 | 221.3 |
| 7108 | 2010-12-31 22:00:00 | 9.6 | 34.6 | 0.4310 | 1044.4 | 767.3 | 861.9 | 889.2 | 1159.1 | 1.6 | 5.2 | 227.4 |
| 7109 | 2010-12-31 23:00:00 | 8.0 | 40.7 | 0.4085 | 952.8 | 691.9 | 908.5 | 917.0 | 1206.3 | 1.5 | 4.6 | 199.8 |
| 7110 | 2011-01-01 00:00:00 | 8.0 | 41.3 | 0.4375 | 1108.8 | 745.7 | 797.1 | 880.0 | 1273.1 | 1.4 | 4.1 | 186.5 |